Background: Patients with cancer have an elevated risk of developing venous thromboembolism (VTE), a common source of morbidity and mortality. Neutrophil extracellular traps (NETs) are thought to contribute to the hypercoagulable state of malignancy. NETs are classified as smudge cells or artifacts by the CellaVision analyzer (CellaVision, Sweden), an automated system for performing hematology laboratory differentials. New VTE biomarkers could improve risk estimation for patients with cancer. We hypothesized that the cellular morphology of white blood cells (WBCs), especially smudge cells and artifacts, captured by imaging would be predictive of VTE in individuals with cancer.
Methods: The cohort consisted of Memorial Sloan Kettering Cancer Center patients with solid tumors or hematological malignancies who had at least one peripheral blood smear scanned by the CellaVision analyzer between 2019 and 2024. VTE events, defined as lower extremity deep vein thrombosis or pulmonary embolism, were flagged using the CEDARS+PINES natural language processing platform (Mantha et al., 2024). WBC images were obtained from the center's clinical CellaVision database. Observation times were discretized into 7-day intervals for prediction. Data were partitioned into training, validation, and test sets (68% train, 12% validation, 20% test). The area under the receiver operating characteristic curve (AUROC) was used to evaluate model performance, with an AUROC of 0.5 being consistent with random chance. The proportions of each WBC type were calculated for each slide and compared using the Mann-Whitney U test. A ResNeXt-based classifier was trained on 20,000 CellaVision images with 18 classes. We used this network to generate a feature embedding for each WBC image in the selected cohort of patients. A gated-attention-based multiple instance learning model was then trained on these slide-specific WBC image embeddings to predict VTE development within the designated time period. Controls were undersampled to ensure class balance during training. Models were trained separately on patients with hematological malignancies, patients with solid tumors, and the combined cohort. These models generated attention scores indicating the relative weight of each cell in the VTE prediction: the higher the attention score for a given WBC image, the larger its contribution to the model output. To analyze attention scores, we used a two-proportion Z-test to compare the proportions of different cell morphologies between the top 10% of scores among correctly identified VTE cases and all attention scores.
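The attention scores described above can be illustrated with a minimal sketch of gated-attention pooling. The abstract does not give the exact formulation used, so this assumes the commonly used form in which each instance embedding h_k receives a raw score w · (tanh(V h_k) ⊙ sigmoid(U h_k)), the scores are normalized with a softmax, and the slide-level embedding is the attention-weighted sum of the instance embeddings. The function name, matrix names (V, U, w), and toy values are hypothetical and chosen only for illustration.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gated_attention_pool(H, V, U, w):
    """Gated-attention pooling over one bag (for example, one slide).

    H: list of K instance embeddings, each of length d (one per WBC image).
    V, U: L x d weight matrices; w: weight vector of length L.
    Returns (attention weights summing to 1, pooled bag embedding).
    All weights are hypothetical here; in practice they are learned.
    """
    scores = []
    for h in H:
        tanh_part = [math.tanh(x) for x in matvec(V, h)]
        gate_part = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        # Raw score: w . (tanh(V h) * sigmoid(U h)), elementwise product.
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_part, gate_part)))

    # Numerically stable softmax over the instances in the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of instance embeddings.
    d = len(H[0])
    bag = [sum(a * h[j] for a, h in zip(attn, H)) for j in range(d)]
    return attn, bag


# Toy bag of three 2-dimensional embeddings (made-up values).
H = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
U = [[1.0, 1.0], [1.0, 1.0]]
w = [1.0, 1.0]
attn, bag = gated_attention_pool(H, V, U, w)
print(attn, bag)
```

In this sketch the per-instance weights in `attn` play the role of the attention scores: an instance with a larger weight contributes more to the pooled embedding that a downstream classifier would use.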
Results: The cohort contained 53,789 patients (41,598 with solid tumors, 12,191 with hematological malignancies). Within 7 days, 411 patients (0.8%) developed a VTE event. In patients with solid tumors, slides from those who developed VTE had increased proportions of lymphocytes (p < 0.01). In patients with hematological malignancies, slides from those who developed VTE were enriched with lymphocytes and eosinophils (p < 0.01). All models predicting VTE at 7 days exhibited an AUROC significantly higher than 0.50. The hematological malignancies cohort model yielded an AUROC of 0.72 (95% CI: 0.60-0.85), and the solid tumor cohort model yielded an AUROC of 0.70 (95% CI: 0.65-0.76). By comparison, the combined cohort model yielded an AUROC of 0.68 (95% CI: 0.61-0.74). In the combined cohort model, the cells with the top attention scores included higher proportions of smudge cells, giant thrombocytes, and lymphocytes (p < 0.01). In the hematological malignancies model, smudge cells and basophils were disproportionately represented among the cells assigned top attention scores (p < 0.01). In the solid tumor model, band neutrophils, eosinophils, monocytes, and smudge cells were overrepresented among the cells with the highest attention scores (p < 0.01).
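The enrichment comparisons above rest on a two-proportion Z-test. As a minimal sketch, assuming the standard pooled-variance form of the test with a two-sided p-value, the calculation looks like this. The counts in the example are hypothetical and are not taken from the study; the sketch also treats the two groups as independent samples.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion Z-test.

    x1 of n1 items in group 1 and x2 of n2 items in group 2 have the
    property of interest (for example, being a smudge cell).
    Returns (z statistic, two-sided p-value).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 120 of 1,000 top-attention cells versus
# 800 of 10,000 cells overall share a given morphology.
z, p_value = two_proportion_z_test(120, 1000, 800, 10000)
print(z, p_value)
```

A positive z here would indicate that the morphology is more frequent among the top-attention cells than in the comparison group.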
Conclusions: Information derived from peripheral white blood cell image embeddings through multiple instance deep learning could help improve risk estimation models for cancer-associated VTE. Further research is required to determine whether these models add prognostic value. If so, this approach could be used to help determine the need for pharmacological VTE prophylaxis in patients with cancer.
Zwicker: Calyx: Consultancy; UpToDate: Patents & Royalties; Med Learning Group: Consultancy; BMS: Consultancy; Regeneron: Consultancy; Incyte Corporation: Research Funding; Quercegen: Research Funding; Parexel: Consultancy. Dogan: AstraZeneca: Research Funding. Mantha: Janssen Pharmaceuticals: Consultancy.